Ground truth bias in external cluster validity indices


Similar resources

Ground truth bias in external cluster validity indices

External cluster validity indices (CVIs) are used to quantify the quality of a clustering by comparing the similarity between the clustering and a ground truth partition. However, some external CVIs show a biased behaviour when selecting the most similar clustering. Users may consequently be misguided by such results. Recognizing and understanding the bias behaviour of CVIs is therefore crucial...
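As a rough illustration of what an external CVI computes, the sketch below implements the plain Rand index, one well-known pair-counting index, for a clustering and a ground truth partition given as label lists. The choice of index and the function name `rand_index` are assumptions made for illustration; the abstract does not say which indices the paper analyses or how it measures their bias.

```python
from itertools import combinations


def rand_index(labels_true, labels_pred):
    """Fraction of object pairs on which two partitions agree.

    A pair counts as an agreement when both partitions put the two
    objects in the same group, or both put them in different groups.
    Illustrative helper only, not taken from the paper.
    """
    if len(labels_true) != len(labels_pred):
        raise ValueError("both label lists must have the same length")
    pairs = list(combinations(range(len(labels_true)), 2))
    if not pairs:
        raise ValueError("need at least two objects to form a pair")
    agreements = sum(
        (labels_true[i] == labels_true[j]) == (labels_pred[i] == labels_pred[j])
        for i, j in pairs
    )
    return agreements / len(pairs)


if __name__ == "__main__":
    ground_truth = [0, 0, 1, 1]
    # Same grouping with the cluster names swapped: a perfect match.
    print(rand_index(ground_truth, [1, 1, 0, 0]))
    # A grouping that cuts across the ground truth classes scores lower.
    print(rand_index(ground_truth, [0, 1, 0, 1]))
```

The index depends only on which objects are grouped together, so renaming the clusters does not change the score.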


Selection Bias, Label Bias, and Bias in Ground Truth

Language technology is biased toward English newswire. In POS tagging, we get 97–98 words right out of 100 in English newswire, but results drop to about 8 out of 10 when running the same technology on Twitter data. In dependency parsing, we are able to identify the syntactic head of 9 out of 10 words in English newswire, but only 6–7 out of 10 in tweets. Replace references to Twitter with re...


Online Cluster Validity Indices for Streaming Data

Cluster analysis is used to explore structure in unlabeled data sets in a wide range of applications. An important part of cluster analysis is validating the quality of computationally obtained clusters. A large number of different internal indices have been developed for validation in the offline setting. However, this concept has not been extended to the online setting. A key challenge is to ...


Improving Cluster Method Quality by Validity Indices

Clustering attempts to discover significant groups present in a data set. It is an unsupervised process, so it is difficult to define when a clustering result is acceptable. Thus, several clustering validity indices have been developed to evaluate the quality of clustering algorithms' results. In this paper, we propose to improve the quality of a clustering algorithm called "CLUSTER" by using a validity ...


An Information-Theoretic External Cluster-Validity Measure

In this paper we propose a measure of similarity/association between two partitions of a set of objects. Our motivation is the desire to use the measure to characterize the quality or accuracy of clustering algorithms by somehow comparing the clusters they produce with "ground truth" consisting of classes assigned by manual means or some other means in whose veracity there is confidence....



Journal

Journal title: Pattern Recognition

Year: 2017

ISSN: 0031-3203

DOI: 10.1016/j.patcog.2016.12.003